 backbone model








54801e196796134a2b0ae5e8adef502f-Paper-Conference.pdf

Neural Information Processing Systems

Although recently proposed parameter-efficient transfer learning (PETL) techniques allow updating a small subset of parameters (e.g. ... This is because the gradient computation for the trainable parameters still requires backpropagation through the large pre-trained backbone model.



Dialect Identification Using Resource-Efficient Fine-Tuning Approaches

Lin, Zirui, Gulzar, Haris, Busto, Monnika Roslianna, Masaki, Akiko, Eda, Takeharu, Nakadai, Kazuhiro

arXiv.org Artificial Intelligence

Dialect Identification (DI) is the task of recognizing different dialects within the same language from a speech signal. DI can help improve downstream speech-related tasks even when speakers have a strong dialect. However, fine-tuning a speech model for tasks like DI is expensive in terms of computation cost and memory requirements. Recent studies have explored fine-tuning pre-trained speech models for tasks like DI using Parameter-Efficient Fine-Tuning (PEFT) methods, which offer parameter efficiency but only limited improvement in memory efficiency and training speed. To address these challenges, we explore Memory-Efficient Fine-Tuning (MEFT) methods, originally proposed for language processing, and apply them to a general-purpose pre-trained speech model. We then comprehensively analyze the GPU memory usage and fine-tuning speed of various MEFT methods. As a case study, we fine-tune the Whisper model to identify six Mandarin subdialects from the KeSpeech dataset, reducing GPU memory usage by up to 73.25% and accelerating training by a factor of 2.1, while maintaining accuracy comparable to vanilla fine-tuning and PEFT methods.


Mitigating Gender Bias in Depression Detection via Counterfactual Inference

Hu, Mingxuan, Ma, Hongbo, Wu, Xinlan, Liu, Ziqi, Liu, Jiaqi, Chen, Yangbin

arXiv.org Artificial Intelligence

Audio-based depression detection models have demonstrated promising performance but often suffer from gender bias due to imbalanced training data. Epidemiological statistics show a higher prevalence of depression in females, leading models to learn spurious correlations between gender and depression. Consequently, models tend to over-diagnose female patients while underperforming on male patients, raising significant fairness concerns. To address this, we propose a novel Counterfactual Debiasing Framework grounded in causal inference. We construct a causal graph to model the decision-making process and identify gender bias as the direct causal effect of gender on the prediction. During inference, we employ counterfactual inference to estimate and subtract this direct effect, ensuring the model relies primarily on authentic acoustic pathological features. Extensive experiments on the DAIC-WOZ dataset using two advanced acoustic backbones demonstrate that our framework not only significantly reduces gender bias but also improves overall detection performance compared to existing debiasing strategies.